PREFACE


This  work  is  currently  supported  by  the  Air 
Force  Human  Resources  Laboratory  and  Air  Force  Wright 
Aeronautical  Laboratories  at  Wright-Patterson  Air 
Force  Base  under  Contract  Number  F33615-82-C-0002 , 
Impact  Analysis  of  ICNIA.  The  guidance  and  support
of  Mr.  James  C.  McManus  and  Mr.  Robert  L.  Harris  of 
these  organizations  are  greatly  appreciated.  The 
methodologies  developed  in  this  report  to  analyze 
reliability  and  supportability  of  integrated,  fault- 
tolerant  avionics  will  be  applied  to  specific  ICNIA 
architectures  in  additional  reports  prepared  under 
this  contract. 
1.  INTRODUCTION


1.1  BACKGROUND

The growing requirement for tactical aircraft Communication,
Navigation and Identification (CNI) avionics in the presence of
volume, weight, power and cost constraints is currently forcing
avionics designers to consider system integration (Reference 1).
Fault tolerance is one feature that an Integrated CNI Avionics
(ICNIA) system must have if reliability and support cost benefits
are to be realized. Exploring the reliability, supportability and
survivability implications of an integrated, fault-tolerant
architecture requires new techniques (Reference 2).

Historically, logistics engineering disciplines have been applied
to new avionics designs in the later stages of development. To
ensure that avionics designs are reliable, supportable and
survivable in the operating environment, logistics engineering
techniques are needed that can be effectively implemented during
the advanced design phase of the system development cycle.
Techniques employed in this phase will challenge design engineers
to provide logistics support, reliability and survivability
capabilities before the design is fixed. In particular, logistics
engineering techniques are needed that do not impose unrealistic
detailed data requirements during the earlier stages of design.

The combination of these two factors creates a need for new and
innovative logistics engineering techniques. The need currently
exists in the two ICNIA system development programs that are being
pursued at the Air Force Wright Aeronautical Laboratories (AFWAL).
One program (System A) uses agile bandpass filter technology, and
the other (System B) employs analog large scale integration
technology.


1.2  OVERVIEW

The logistics analysis methods presented in this
paper are appropriate for integrated, fault-tolerant systems,
such  as  ICNIA,  early  in  the  development  cycle.  Traditional 
and  innovative  maintenance  concepts  are  investigated.  In 
particular,  the  increased  ability  to  sustain  sorties  with 
limited  repair  capability  is  evaluated  for  deferred  repair 
policies.  A  detailed  example  is  presented  to  demonstrate 
the  reliability  and  supportability  methodology. 




These  techniques  were  developed  under  the  Impact 
Analysis  of  ICNIA  Program.  The  program  has  the  following 
additional  goals: 

1.  Apply these techniques to the two
ICNIA architectures under development.

2.  Influence the ICNIA designs to improve
reliability, supportability and survivability.

3.  Document  the  research  and  development 
results  in  a  form  amenable  for  use  by 
design  engineers. 

An  overview  of  the  Impact  Analysis  of  ICNIA  Program  is  shown 
in  Figure  1.  Research  in  the  reliability,  supportability 
and  survivability  areas  was  preceded  by  front-end  analyses 
to  determine  the  applicability  of  existing  techniques.  The 
output  of  the  research  in  each  area  consists  of  documented 
methods  for  evaluation  of  integrated,  fault-tolerant  designs
and  the  associated  logistics  options,  as  well  as  specific 
evaluations  and  design  feedback  for  the  ICNIA  designs. 


Figure  1.  Overview  of  Impact  Analysis  of  ICNIA 
1.3  ORGANIZATION OF THIS PAPER

The Impact Analysis of ICNIA program is concerned with three major
factors: reliability, logistics support and survivability. The
methodology in each area draws on a common representation of the
system. The reliability methodology is presented in Section 2. The
system architecture representation, which is relevant to all three
areas, is introduced, and an example is presented and analyzed
extensively. Section 3 presents the logistics support analysis
methodology. The same example is analyzed for supportability.
Interim conclusions and recommendations are stated in Section 4.

Application of these methodologies to ICNIA Systems A and B will
be reported in References 3 and 4, respectively. Design feedback
for the architectures will be provided in these reports.


2.  RELIABILITY  ANALYSIS 


The fault tolerance of ICNIA, achieved through dynamic
reconfigurability, makes the analysis of system reliability more
complex than for traditional systems. The integration of many
radio functions creates interdependent failure modes that are not
well described by existing measures of reliability. As a result,
new measures of effectiveness are needed.

The applicability of previous work is examined in Section 2.1. A
reliability methodology is then presented in Section 2.2 that
includes development of fault tolerance indices and
identification/classification of failure modes in a mission
scenario. Mission scenarios are discussed in Section 2.3. An
example architecture is presented in Section 2.4 and analyzed in
Section 2.5. Some conclusions are drawn in Section 2.6.


2.1  FRONT-END  STUDY  FINDINGS 

A  front-end  study  was  conducted  to  ascertain  the 
applicability  of  existing  reliability  analysis  techniques  to 
ICNIA-type  systems.  The  primary  focus  was  to  review  the
features  of  reliability  models  and  procedures  currently  in 
use  by  the  military  services.  Following  is  a  brief  summary 
of  the  techniques  surveyed. 

MIL-HDBK-217D  Reliability Prediction of Electronic Equipment

This  handbook  is  used  for  reliability  estimation  of 
individual  components.  Failure  rates  are  estimated  based  on 
parts  count  and  a  stress  analysis.  While  this  procedure  is 
applicable  to  individual  components,  it  does  not  address 
system  structure,  which  is  the  key  to  fault  tolerance. 

MIL-STD-756  Reliability  Prediction 

This standard is used for system reliability prediction.
Conventional combinatoric probability is used to relate
series/parallel structures to mission, or system, reliability.
The reconfigurable aspect of ICNIA-type systems is not captured.

DEPEND 


The Determination of Equipment Performance and Expected
Nonoperational Delay (DEPEND) (Reference 5) models reliability and
availability for redundant systems with backup modes of operation.
The model considers the fault tolerance achieved through
redundancy but not through the sharing of resources in an
integrated system. As a result, the analysis of dynamically
reconfigurable systems is limited.

AEP 


The  Avionics  Evaluation  Program  (AEP)  (Reference  6) 
estimates  mission  success  and  abort  rates,  as  well  as  costs. 
The  model  is  essentially  a  Monte  Carlo  simulation  of  flight 
operations  in  a  specified  scenario.  Redundancy  is  modeled 
at  the  subsystem  level.  Component  redundancy,  integrated 
systems  and  dynamic  reconfiguration  are  not  addressed.  In 
addition,  the  magnitude  of  the  model  makes  it  inappropriate 
as  an  interactive  design  tool. 

None of the models reviewed appears adequate for representing
integrated, reconfigurable systems. The literature on reliability
theory of complex systems was also reviewed. The framework of
structural reliability as developed algebraically by Birnbaum
et al. (Reference 7), or the equivalent fault-tree approach
(Reference 8), applies to these systems. However, existing
computational techniques, such as those in Reference 9, seem
inadequate for dealing with the complex system structures needed
to realistically model the ICNIA systems.

One approach that has been taken to avoid the computational limits
on reliability structures is Monte Carlo analysis. Even this
approach requires the mapping from point failures into system
failure. No suitable approach to defining this mapping for
detailed ICNIA-type systems is available. Some progress in this
area has been made by the ICNIA System A and B contractors. In
particular, the System B contractor has avoided constructing the
mapping by building a Monte Carlo simulation around the system
control algorithm, which determines whether a system failure
results from each point failure that occurs. However, this
approach does not lend itself to use as a reliability design tool
in the early phases of development. The need for detailed data
concerning the dynamic operating environment and the system
controller, coupled with high computer run times, makes such a
model cumbersome to use.
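The role of the point-failure-to-system-failure mapping in a Monte Carlo analysis can be illustrated with a minimal sketch. It is not the contractors' simulation: the function name estimate_mcsp, the system_up callback and the exponential lifetime assumption are all introduced here for illustration only.

```python
import random


def estimate_mcsp(failure_rates, system_up, mission_hours, trials=20000, seed=0):
    """Estimate the fraction of simulated missions completed without system failure.

    failure_rates: failure rate (per hour) of each component; exponential
                   lifetimes are an assumption of this sketch.
    system_up:     hypothetical callback mapping the set of surviving component
                   indices to True/False. This is the mapping that the text
                   says is hard to construct for ICNIA-type systems.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        # Sample a failure time per component and keep those that outlast the mission.
        alive = {i for i, rate in enumerate(failure_rates)
                 if rng.expovariate(rate) > mission_hours}
        # With no repair and a structure that only degrades as components fail,
        # checking the state at mission end is sufficient.
        if system_up(alive):
            successes += 1
    return successes / trials


# Two interchangeable components, one of which must survive a 3-hour mission.
estimate = estimate_mcsp([0.1, 0.1], lambda alive: len(alive) >= 1, 3.0)
```

Even in this toy case the simulation needs many trials to resolve a small failure probability, which is consistent with the run-time concern raised above.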

The  primary  conclusion  of  the  front-end  study  was 
that  the  existing  reliability  techniques  did  not  satisfy  all 
of  the  analysis  requirements  for  ICNIA-type  systems.  As  a 
consequence,  an  essentially  new  methodology  was  developed 
and  is  described  below. 
2.2  METHODOLOGY


This section introduces the methodology for analyzing
reliability of integrated, fault-tolerant systems.
First,  measures  of  effectiveness  are  defined.  Next,  a  method 
of  representing  such  systems  by  a  structural  reliability 
model  is  presented.  Finally,  computational  techniques  for 
the  structural  reliability  model  are  developed.  An  overview 
of  the  model  is  provided  at  the  end  of  the  section. 

Measures  of  Mission  Reliability 

Because of the multiplicity of functions supported by ICNIA and
their varying importance to different missions, a combined measure
of effectiveness for mission reliability is needed. We define
Mission Completion Success Probability (MCSP) as the probability
that a given set of critical functions is available throughout a
given mission. A related measure is Mean Time Between Critical
Failure (MTBCF), where a critical failure is a failure or a
combination of failures that makes a critical function
unavailable. These measures are meaningful in a mission context
where a set of CNI functions is considered critical for mission
success. It is assumed that no repair action is taken between
critical failures. When a single function is being considered as
critical, MTBCF will be referred to as Mean Time Between Function
Failure (MTBFF). Thus, the two measures are interchangeable when
only a single function of the complete set of CNI functions is
considered critical. A useful index of fault tolerance is failure
resiliency, defined as the ratio of MTBCF (or MTBFF) to the
traditional Mean Time Between Failure (MTBF). Since MTBF refers to
the first failure in the system, failure resiliency is greater
than or equal to one. Larger failure resiliency values correspond
to systems with a higher degree of fault tolerance.

A single function is considered available if the system controller
can select a configuration to bring the function up, with a
specified level of performance. The availability of a set of
functions is complicated by the competition between functions for
resources. System resources are modeled as discrete "failure
units" or components. A component fails as a unit and is monitored
individually by the system controller for reconfiguration
purposes. Component requirements vary over time depending on the
presence of a signal or pilot input. The time history of component
utilization can also be scheduled by the controller within certain
tolerances. Thus, dynamic reconfigurability makes it difficult to
determine whether functions conflict.


Structural  Reliability  Formulation 

A practical approach to determining function availability is to
classify components based on their dynamic features and then
represent them accordingly in a static model structure. This
approach makes rapid reliability computations possible and is
taken in this study.

Three  types  of  component  utilization  have  been 
identified: 

1.  Contending: The functions are available
if there is a configuration in which
separate components are used to perform
each function.

2.  Timesharing: Each function utilizes a
component a fraction of the time. A
set of functions is available if there
is a configuration in which no component
is overloaded.

3.  Noncontending: The functions are available
if there are sufficient components
for each individual function.

Components  are  contending  with  respect  to  certain  functions 
if  the  components  must  be  dedicated  constantly,  or  at  rigidly
scheduled  times,  to  supporting  the  functions  (e.g.,  receivers 
used  to  monitor  communication  channels).  Components  are 
timeshared  if  they  are  utilized  by  a  function  at  flexibly 
scheduled  times  so  that  several  functions  can  be  interleaved 
(e.g.,  data  processors).  Resources  that  can  be  used  by  any 
number  of  functions  simultaneously,  such  as  power  supplies, 
are  always  noncontending. 

The classification of components as contending, noncontending, or
timesharing also depends on the times during a mission at which
each function is required. If functions are not required
simultaneously, all components are noncontending.

Within  the  context  of  these  definitions,  dynamic 
reconfigurability  can  be  represented  by  a  structural  model 
which  gives  meaningful  measures  of  reliability  for  a  specific 
mission  type.  The  mission  is  characterized  by  the  functions 
required  and  the  simultaneity  of  these  functions. 
Structural  Reliability  Computations

In  order  to  compute  MCSP  for  a  given  mission  scenario 
with  specified  function  requirements,  the  mapping  from  system 
health  (the  state  of  each  component)  to  functional  capability 
is needed. Unfortunately, traditional approaches to evaluating
this mapping (Reference 9) are practical only for systems
with  a  certain  modular  structure  that  does  not  apply  to  ICNIA 
architectures.  Furthermore,  it  is  desirable  to  represent 
this  mapping  for  individual  functions  rather  than  complete 
missions,  so  that  a  variety  of  missions  can  be  constructed 
from  a  single  data  base. 


For the ICNIA architectures that have been examined, it is
possible to take advantage of the special structure of this
mapping to compute MCSP efficiently. The computations, as
implemented in the Mission REliability Model (MIREM), are detailed
in Appendix A. The basic approach is to assume a structure
corresponding to two levels of reconfigurability or switching.
This type of structure is illustrated in Figure 2.



At the lowest level, pools of interchangeable components are
identified. Each function utilizes a certain number of components
(or fraction of a component) in a pool. For pools of contending or
timeshared components, the total requirement for a pool is the sum
of the utilizations of each required function; for noncontending
components, the total requirement is the maximum function
utilization. If functions are not required simultaneously, all
pools are considered noncontending. MCSP is the product of the
probabilities of each pool having sufficient components operating.
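The pool-level rule can be sketched in a few lines. This is an illustration rather than the MIREM implementation detailed in Appendix A: the function names are hypothetical, component lifetimes are assumed exponential and independent, and a fractional total requirement is assumed to round up to a whole number of operating components.

```python
from math import ceil, comb, exp


def pool_requirement(utilizations, pool_type):
    """Total demand on a pool from the functions required simultaneously.

    pool_type "C" (contending or timeshared): sum of the utilizations.
    pool_type "N" (noncontending): maximum utilization.
    """
    if pool_type == "C":
        return sum(utilizations)
    if pool_type == "N":
        return max(utilizations)
    raise ValueError(f"unknown pool type: {pool_type!r}")


def pool_success_probability(n_components, requirement, failure_rate, mission_hours):
    """P(enough of n identical components survive the mission).

    Assumptions of this sketch: exponential lifetimes, no repair, and a
    fractional requirement rounded up to whole components.
    """
    p_up = exp(-failure_rate * mission_hours)
    needed = ceil(requirement - 1e-9)  # tolerance guards against float round-off
    return sum(comb(n_components, i) * p_up**i * (1 - p_up)**(n_components - i)
               for i in range(needed, n_components + 1))


def chain_mcsp(pools, mission_hours):
    """Product of pool success probabilities.

    pools: iterable of (n_components, pool_type, utilizations, failure_rate).
    """
    result = 1.0
    for n, pool_type, utilizations, rate in pools:
        requirement = pool_requirement(utilizations, pool_type)
        result *= pool_success_probability(n, requirement, rate, mission_hours)
    return result
```

For two functions that each use one component, a type C pool must supply two components while a type N pool must supply only one, which is the distinction the paragraph above draws.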

The second level of reconfiguration is between parallel chains. A
chain is a set of pools that is switched (reconfigured) as a
group. In many cases a chain will correspond to a Line Replaceable
Unit (LRU), because LRUs have separate power supplies and limited
inter-LRU connections. A set of functions is available on parallel
chains if there is an allocation of functions to chains such that
each chain can support its allocated functions. The approach to
evaluating MCSP on parallel chains consists of enumerating all
possible allocations of functions to chains (see Appendix A). This
approach is computationally feasible whereas the traditional
enumeration of component states is not, the difference being that
there are many more components than required functions.
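The availability test on parallel chains can be sketched as a search over allocations. The sketch covers only the deterministic check for one state of system health, not the probability computation of Appendix A. The function names, the NEEDS and ALLOWED tables and the single preprocessor pool per chain are hypothetical simplifications, with the preprocessor counts taken from the example of Section 2.4.

```python
from itertools import product


def functions_available(functions, chains, can_support):
    """True if some allocation of functions to chains lets every chain
    support the functions allocated to it."""
    for assignment in product(chains, repeat=len(functions)):
        allocated = {c: [f for f, a in zip(functions, assignment) if a == c]
                     for c in chains}
        if all(can_support(c, fs) for c, fs in allocated.items()):
            return True
    return False


# Hypothetical data for illustration: preprocessors needed by each function
# and the chains each function is allowed to use.
NEEDS = {"GPS": 3, "UHF": 1, "SINCGARS": 1}
ALLOWED = {"GPS": {"II"}, "UHF": {"II", "III"}, "SINCGARS": {"II", "III"}}


def make_can_support(operating):
    """Build a chain check from the number of operating preprocessors per chain."""
    def can_support(chain, funcs):
        return (all(chain in ALLOWED[f] for f in funcs)
                and sum(NEEDS[f] for f in funcs) <= operating[chain])
    return can_support


all_up = make_can_support({"II": 3, "III": 2})
one_failed = make_can_support({"II": 3, "III": 1})
print(functions_available(["GPS", "UHF", "SINCGARS"], ["II", "III"], all_up))
print(functions_available(["GPS", "UHF", "SINCGARS"], ["II", "III"], one_failed))
```

The number of allocations grows with the number of chains raised to the number of functions, which stays small when only a handful of critical functions and two chains are involved.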

Total system MCSP is the product of the MCSP for each
chain/parallel chain set. Other measures of effectiveness can be
derived from MCSP. Of particular importance are MTBCF, which is
computed by evaluating and numerically integrating MCSP for
different mission durations, and failure resiliency, which is
defined as the ratio of MTBCF to MTBF.
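The integration step can be sketched as follows, using the standard identity that the mean time to the first critical failure equals the integral of the survival probability over mission duration. The trapezoidal rule, the finite integration horizon and the two MCSP curves are assumptions of this sketch; the numerical method MIREM actually uses is not stated here.

```python
from math import exp


def mtbcf(mcsp, horizon_hours, steps=20000):
    """Trapezoidal approximation of the integral of MCSP(t) from 0 to the horizon.

    mcsp: function returning MCSP for a mission of duration t hours.
    horizon_hours: must be long enough that MCSP has decayed to nearly zero.
    """
    h = horizon_hours / steps
    total = 0.5 * (mcsp(0.0) + mcsp(horizon_hours))
    for i in range(1, steps):
        total += mcsp(i * h)
    return total * h


RATE = 1.0 / 224.0  # hypothetical component failure rate per hour


def single(t):
    """MCSP when one component is critical."""
    return exp(-RATE * t)


def one_of_two(t):
    """MCSP when either of two identical components suffices."""
    return 1.0 - (1.0 - exp(-RATE * t)) ** 2


print(round(mtbcf(single, 40 * 224.0), 1))      # close to 1/RATE
print(round(mtbcf(one_of_two, 40 * 224.0), 1))  # close to 1.5/RATE
```

The second curve integrates to one and a half times the first, showing how redundancy raises MTBCF for the same component failure rate.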

The reliability analysis methodology is summarized in Figure 3.
System structure data are converted to files containing the pool
and chain data needed by MIREM. With the additional inputs of
failure rates, mission requirements and initial system health,
MIREM computes measures of effectiveness plus LRU failure
probabilities for use in the logistics analysis.


Figure 3.  Reliability Analysis Overview
(MIREM outputs: MCSP, MTBCF, MTBFF, LRU failure probabilities)


2.3  MISSION SCENARIOS

A mission can be described by a time sequence of CNI radio system
or function requirements. Several factors affect whether the
operational requirements of a mission can be met in a given state
of system health:

1.  The set of critical functions (CF) required
for the mission.

2.  The combinations of these functions
that are required simultaneously.

3.  The time slots during which resources
must be used to process signals within
the interval when a function is required.

4.  The time response required when a function
requirement is received compared
with the reconfiguration speed of the
system.

The last two factors can generally be modeled by appropriate
classification of pools as contending or noncontending, and
selection of pool capacity requirements. The first two factors
have been dealt with in previous efforts (Reference 10) by
dividing the mission into phases, each of which has distinct
function requirements. In the current analysis, a single set of
functions is considered for two cases of simultaneity:

(a)  All functions are required simultaneously.

(b)  Each function is required independently.

These two cases bound the actual mission environment. The worst
case, (a), is used as the baseline for analysis.

The current analysis could be generalized to consider mission
phases by including logical "or"s in the function requirements;
e.g., (A and B) or (A and C and D). Although each phase would have
a term in the logical expression, it could be reduced to a few
dominant terms. In this manner, varying mission requirements could
be analyzed with a static, structural model.

The  mission  scenarios  which  have  been  identified 
for  analysis  are  listed  in  Table  1  (References  10,  11,  and  12). 
These  scenarios  will  be  used  to  analyze  the  ICNIA  systems  A 
and  B.  Interdiction/Offensive  Counter  Air  will  be  used  as  a 
baseline  for  analysis. 


TABLE 1.  MISSION REQUIREMENTS

SCENARIO                      CRITICAL FUNCTIONS

Interdiction/Offensive        UHF, JTIDS, GPS, IFFT
Counter Air

Close Air Support             HF, VHF, UHF, SEEK TALK,
                              SINCGARS, JTIDS, IFFT

Defensive Counter Air         UHF, VHF, SEEK TALK, IFFI,
                              IFFT

"Generic"                     ILS, UHF, A/J VOICE, GPS,
                              TACAN, IFFT

"Most Stringent"              HF, VHF, VHF (GUARD), UHF,
Simultaneous Requirements     UHF (GUARD), JTIDS, IFFT,
                              IFFI

The functions listed for these scenarios are those necessary for
survival/safety and mission success. Alternative requirements
keyed only to survival could also be used to assess the impact of
the system on aircraft losses.


2.4  APPLICATION  TO  AN  EXAMPLE  ARCHITECTURE 

A simple example of a fault-tolerant architecture is discussed
here to illustrate MIREM capabilities. The structure is shown in
Figure 4. Low-band functions require one of the two low-band
receive front ends; hence, they form the pool B. Low-band
functions also require preprocessors in the set C or D. The UHF
and SINCGARS functions, for example, require a total of two of the
five preprocessors. Preprocessors in set C can be used only if
certain other components in the larger group II are up. Similarly,
the set D depends on components in group III.

This two-level structure is typical of those found in ICNIA
designs (References 13 and 14). MIREM classifies C and D as pools
and II and III as parallel chains. Pools A and B can be considered
a series chain. Connection between these parallel chains is
through the series chain (I). Pool boundaries are defined by the
first level of reconfigurability; parallel chains are defined by
the second.


Figure  4.  A  Simplified  Fault-Tolerant  Architecture 
(CNI  Receive  Functions) 
The input data required by MIREM for each pool are shown in
Table 2. The table indicates that GPS, for example, requires one
L-band receiver front end, three preprocessors, 80% of the
capacity of a signal processor, one power supply, and one
controller. The manner in which functions interact is given under
pool type. Timesharing and contending pools are listed as type C;
noncontending pools are listed as type N. Pool type dictates how
utilizations are combined across functions. For example, the
combination of UHF and SINCGARS requires two preprocessors but
only one front end. Table 2 also shows the number of components,
or capacity, and the component failure rate in each pool.
Components within a pool are assumed to be identical.

Two other pool types are also considered. A set of pools, one in
each parallel chain, is shared (type S) if the pool in one chain
can be used by functions allocated to another chain. Chain-fail
pools (type F) are those which, upon failure, prevent any of the
pools in the chain from being utilized. In this example the signal
processors are connected by a data bus, so that they are shared by
chains II and III. Loss of a power supply prevents any of the
pools in that chain from being used.

Many reconfigurable designs can be modeled by the pool/chain
concept. However, care must be taken to represent failure modes
properly, particularly for switching and control resources. The
interpretation of backup components as a pool, i.e., components
that are in parallel, assumes that the backup will take over when
a component fails. This is accomplished in ICNIA through Built-In
Test (BIT) equipment, RF switching and flexible processor
interconnections, all coordinated by a control processor. Failures
in these components can be modeled as an additional pool. The fact
that not all failures can be detected by BIT, however, is not
modeled.


2.5  RESULTS

Reliability  results  are  presented  in  this  section 
for  the  example  introduced  in  Section  2.4.  Table  3  shows 
MTBFF and failure resiliency for each function considered individually
and independent of any mission. UHF and SINCGARS
both  have  very  good  reliability.  This  is  explained  by  the 
fact  that  no  single  component  failure  can  make  these  functions 
unavailable.  GPS,  being  restricted  to  chain  II,  has  several 
critical  components,  thus  it  exhibits  a  low  MTBFF.  The  fault 
System reliability in a mission context, expressed by MTBCF, is
considerably lower. Two mission scenarios are considered in
Table 4, one requiring all three functions simultaneously, and one
requiring only UHF and SINCGARS. Both missions are three hours in
length. For Scenario 1, fault tolerance only extends the MTBF of
224 hours to an MTBCF of 249 hours, whereas for Scenario 2 the
increase is dramatic. Hence, failure resiliency is very dependent
on the mission scenario. Only 2.5% of the critical failures for
Scenario 1 occur in chain I, with the rest occurring in the
parallel chains II and III. If the functions are not required
simultaneously, the MTBCF for Scenario 1 increases to 389 hours,
with a failure resiliency of 1.74.

TABLE 4.  MISSION RELIABILITY

MISSION SCENARIO             MCSP                MTBCF      FAILURE*
                             (3-hour mission)    (hours)    RESILIENCY

1  GPS, UHF and SINCGARS     0.9880              249        1.11
   required simultaneously

2  UHF and SINCGARS          0.999996            1379       6.15
   required simultaneously

*Failure Resiliency = MTBCF/MTBF; MTBF = 224 hours
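The failure resiliency figures quoted for Scenario 1 can be checked directly from the definition. The function name below is hypothetical; the MTBF and MTBCF values are the ones stated in the text for the simultaneous and non-simultaneous cases.

```python
def failure_resiliency(mtbcf_hours, mtbf_hours):
    """Ratio of Mean Time Between Critical Failure to Mean Time Between Failure."""
    return mtbcf_hours / mtbf_hours


MTBF_HOURS = 224.0

# Scenario 1, functions required simultaneously (MTBCF = 249 hours).
print(round(failure_resiliency(249.0, MTBF_HOURS), 2))  # 1.11

# Scenario 1, functions not required simultaneously (MTBCF = 389 hours).
print(round(failure_resiliency(389.0, MTBF_HOURS), 2))  # 1.74
```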


A major advantage of MIREM as a design tool is its ability to
evaluate the impact of proposed design changes. Table 5 shows the
sensitivity of MCSP to redundancy levels using the architecture
discussed above as the baseline. Adding a second signal processor
to chain II, for example, reduces the probability of mission
failure (1 - MCSP) by 10%. Additional preprocessors improve
reliability dramatically because of their high failure rate and
because all five are required for this scenario. Other mission
scenarios would show different sensitivities.
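The percentage reduction in mission failures follows from the baseline and modified MCSP values, taking the probability of mission failure as 1 - MCSP as stated above. The function name is hypothetical; the MCSP values are those quoted for Scenario 1.

```python
def mission_failure_reduction(baseline_mcsp, new_mcsp):
    """Percent reduction in the probability of mission failure (1 - MCSP)."""
    baseline_failure = 1.0 - baseline_mcsp
    new_failure = 1.0 - new_mcsp
    return 100.0 * (baseline_failure - new_failure) / baseline_failure


BASELINE_MCSP = 0.9880

# Second signal processor in chain II (new MCSP 0.9892): about 10 percent.
print(mission_failure_reduction(BASELINE_MCSP, 0.9892))

# Fourth preprocessor in chain II (new MCSP 0.9970): about 75 percent.
print(mission_failure_reduction(BASELINE_MCSP, 0.9970))
```

Because the baseline failure probability is only 0.012, a change of about one part in a thousand in MCSP already corresponds to a reduction of roughly 10% in mission failures.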

Table  6  gives  the  sensitivity  of  MCSP  to  the  degree 
of  reconfigurability  of  the  system.  The  primary  restriction 
to  reconfigurability  is  that  GPS  must  use  chain  II.  Adding 
the  appropriate  switching  and  a  third  preprocessor  to  chain  III, 
so  that  GPS  can  use  either  chain,  has  a  large  reliability 
payoff.  On  the  other  hand,  reducing  reconfigurability  by 
eliminating  the  data  bus  between  the  signal  processors  does 
not  significantly  degrade  reliability. 
TABLE 5.  SENSITIVITY OF MCSP TO REDUNDANCY LEVELS
FOR SCENARIO 1

REDUNDANCY OPTION
BASELINE                  PROPOSED                NEW       % REDUCTION IN
ARCHITECTURE*             MODIFICATION            MCSP      MISSION FAILURES

2 Signal Processors       3 Signal Processors     0.9892    10
                          (2 in chain II)

5 Preprocessors           6 Preprocessors         0.9970    75
(3 in chain II, 2 in      (4 in chain II)
chain III)
                          6 Preprocessors         0.9916    30
                          (3 in chain III)

1 L-band Receiver         2 L-band Receivers      0.9883    3

*Baseline MCSP = 0.9880


TABLE 6.  SENSITIVITY OF MCSP TO RECONFIGURABILITY
FOR SCENARIO 1

RECONFIGURABILITY OPTION
BASELINE                  PROPOSED                    NEW       % REDUCTION IN
ARCHITECTURE*             MODIFICATION                MCSP      MISSION FAILURES

Share signal processors   Separate signal processor   0.9880    0
between chains            for each chain

GPS must use chain II     GPS can use chain II        0.9970    75
                          or III
                          (add 3rd preprocessor
                          to chain III)

*Baseline MCSP = 0.9880
2.6  CONCLUSIONS


A  structural  reliability  model  has  been  presented 
which  can  represent  the  features  of  integration  and  fault 
tolerance  in  complex  systems.  The  model  focuses  on  dynamic 
reconfigurability  and  does  not  consider  the  issues  of  Built- 
In  Test  (BIT)  coverage,  software  inadequacies  or  failures 
and  cabling  failures.  Several  conclusions  can  be  drawn  from 
the  reliability  example  which  was  analyzed: 

1.  Single components that can cause system
failures (critical failures), if
they exist, are the single most important
factor in Mission Completion
Success Probability (MCSP) and a major
factor in Mean Time Between Critical
Failure (MTBCF).

2.  A second level of redundancy (at the
LRU level) improves reliability only
if all critical functions are supported
on both of the LRUs.

3.  The determination of which functions
are critical for a mission and whether
they are required simultaneously can
drastically affect MCSP.

4.  Reconfigurability (e.g., inter-LRU
connections) between components that
are already redundant does not necessarily
enhance reliability.

Efficient  computation  of  reliability  measures  is  possible 
with  this  model.  Furthermore,  the  model  has  the  advantage 
of  not  requiring  highly  detailed  design  inputs. 


3.  LOGISTICS  SUPPORT  ANALYSIS 


The  potential  advantages  of  integrated,  fault-tolerant
CNI  avionics  from  the  logistics  support  perspective  are  readily 
apparent.  Some  of  the  larger  impacts  are  expected  in: 

1.  Adoption  of  two-level  maintenance. 

2.  Faster turnaround at the flight-line
level.

3.  Greater number of sorties between corrective
maintenance actions.

These  changes  offer  payoffs  in  both  Life-Cycle  Cost 
(LCC)  and  operational  readiness.  Integrated,  fault-tolerant
architectures  exhibit  the  potential  for  increasing  readiness 
levels  above  those  of  existing  discrete  systems  at  equal  or 
lower  LCC.  This  feature  has  added  meaning  with  the  emerging 
requirements  of  sustained  combat  capability  under  a  bare 
base  (i.e.,  no  repair  capability)  environment  with  limited 
spares  budgets.  To  achieve  this  objective,  however,  emphasis 
needs  to  be  placed  not  only  on  hardware/software  reliability 
and  system  architecture,  but  also  on  Built-In  Test  (BIT), 
modularity,  and  support  strategies. 

This  section  presents  a  method  of  evaluating  the 
operational  readiness  payoff  of  integrated,  fault-tolerant 
avionics.  The  method  can  evaluate  alternative  repair  strat¬ 
egies  and  is  consistent  with  the  limited  data  available  during 
the  early  stages  of  system  design.  An  overview  of  the  meth¬ 
odology  is  shown  in  Figure  5.  The  applicability  of  previous 
work  is  discussed  in  Section  3.1.  The  logistics  support 
scenario  to  be  modeled  is  described  in  Section  3.2.  Sec¬ 
tion  3.3  presents  the  modeling  methodology.  Model  inputs 
for an example architecture are defined in Section 3.4, and
results  are  given  in  Section  3.5.  Some  conclusions  are  drawn 
in  Section  3.6. 


3.1  FRONT-END  STUDY  FINDINGS 

Several  logistics  analysis  techniques  were  assessed 
as to applicability to analysis of integrated, fault-tolerant
architectures  using  both  conventional  and  innovative  mainte¬ 
nance  concepts.  In  particular,  six  models  were  evaluated  in 
some depth. Brief discussions of these six models, their
principal features and applicability to the ICNIA analysis
requirements are provided in the following paragraphs.

Figure 5. Readiness Methodology Overview

ALPOS  -  The  Avionics  Laboratory  Predictive  Operations 
and  Support  model  (Reference  15)  is  a  parametric  operating 
and  support  cost  model  based  on  historical  data.  It  was 
derived  using  multiple  regression  techniques.  It  does  not 
capture the integrated fault-tolerant characteristics of
ICNIA  nor  can  it  model  the  innovative  maintenance  concepts 
applicable  to  ICNIA. 

LCOM  -  The  Logistics  Composite  Model  (Reference  16) 
is  a  discrete  event  simulation  model  based  on  Monte  Carlo 
techniques  which  captures  in  very  fine  detail  the  logistics 
structure  of  the  maintenance  scenario  and  the  hardware  struc¬ 
ture  (typically  of  a  major  weapon  system).  It  does  not  lend 
itself  to  early  design  work,  where  the  data  are  limited, 
although  it  could  be  streamlined  with  some  effort. 

ORLA - Optimum Repair Level Analysis (Reference 17)
is an expected value model for determining optimum (least
cost) policies for repairing/discarding LRUs and/or Shop
Replaceable Units (SRUs) at the intermediate or depot level.
Determinations  are  based  on  spares,  support  equipment,  and 
other  support  costs.  The  technique  does  not  capture  the 
fault-tolerant  characteristics  of  ICNIA  since  it  is  driven 
largely  by  MTBF  and  traditional  support  factors. 

LSC  -  The  Logistics  Support  Cost  (Reference  18) 
model  consists  of  10  equations  which  address  support  costs. 
The  model  does  not  explicitly  capture  innovative  maintenance 
concepts  applicable  to  ICNIA. 

LCC2  -  The  Life-Cycle  Cost  Model  Version  2  (Refer¬ 
ence  19)  is  based  on  LSC  equations.  Although  it  provides 
flexibility  as  to  maintenance  concept  modeling,  it  does  not 
capture  readiness  factors  and  is  not  applicable  to  the  early 
design  phase. 

MOD-METRIC - The MOD-METRIC model (Reference 20) is
a set of sparing algorithms that treats the multi-item, multi-echelon,
and multi-indenture inventory problem in an optimization
framework. The model is limited to spares and does not
capture the relevant logistics factors impacting system
readiness.

Dyna-METRIC - The Dyna-METRIC model (Reference 21)
incorporates dynamic queueing equations that extend the MOD-METRIC
capabilities to transient behavior under time-varying
operations. Like MOD-METRIC, the model addresses optimal
sparing and spares availability, but does not capture other
logistics factors impacting system readiness.

SOAR - The Simulation of Operational Availability/
Readiness model (Reference 22) is a continuous flow simulation
model based on system dynamics techniques that capture
the  reliability  and  maintainability  parameters  of  a  system 
with  the  dynamics  of  logistics  support  at  a  single  base  in 
order  to  evaluate  mission  availability  at  the  squadron  or 
wing  level.  It  is  applicable  to  early  system  design  and  its 
network  flow  framework  can  be  extended  to  capture  innovative 
maintenance  concepts  for  ICNIA. 

The  main  conclusion  drawn  from  this  front-end  study 
is  that  no  single  technique  captures  all  of  the  ICNIA  analy¬ 
sis  requirements.  These  models  were  developed  with  specific 
objectives  in  mind  and  address  some  of  the  ICNIA  analysis 
needs  but  not  all.  The  SOAR  model  appeared  to  be  the  tech¬ 
nique  closest  to  the  ICNIA  logistics  support  analysis  re¬ 
quirements.  This  technique  was  selected  for  analysis  of 
operational  readiness  with  some  modification  for  capturing 
innovative  maintenance  concepts. 


3.2  LOGISTICS  SUPPORT  SCENARIO 

The  logistics  support  scenario  being  modeled  incor¬ 
porates  the  dynamics  of  aircraft  sortie  and  maintenance  oper¬ 
ations  at  a  single  site  (e.g.,  air  base)  from  the  perspective 
of  the  equipment  under  study  (Figure  6).  Weapon  system  sortie 
requirements,  expressed  in  terms  of  desired  number  of  sorties 
per  day,  are  generated  over  a  given  time  period.  The  weapon 
system  is  viewed  in  terms  of  the  equipment  under  study  and 
the  rest  of  the  aircraft  with  their  associated  reliability 
and  maintainability  parameters  and  support  resources.  Main¬ 
tenance  operations  and  logistics  support  at  the  organizational, 
intermediate  and  depot  level  maintenance  sites  are  represented. 

The  flight  line,  or  organizational-level,  maintenance 
activities  consist  primarily  of  removal  and  replacement  (R/R) 
of Line Replaceable Units (LRUs). For fault-tolerant system
applications,  R/R  actions  may  take  place  when  the  first  fail¬ 
ure  occurs  or  be  deferred  until  system  critical  failures 
occur  (i.e.,  loss  of  a  critical  function).  These  two  repair 
policies will be referred to as immediate and deferred repair,
respectively.  Deferred  repair  is  an  innovative  maintenance 
concept  which  would  require  significant  institutional  changes 
to  implement.  The  procedure  would  rely  heavily  on  BIT  equip¬ 
ment  to  determine  system  health  and  an  intelligent  system  to 
make  the  repair/defer  decision  based  on  system  health  and 
the type of mission to be flown. Compromise maintenance
policies, which would defer repair of some noncritical failures
and repair others, could be developed based on the increased
risk of additional failures causing a critical
failure in a degraded system. For the mission scenarios and
system  architectures  considered  to  date,  however,  the  increase 
in  risk  is  generally  small. 

After  flight  line  removal,  faulty  LRUs  then  enter 
the intermediate, or I-level, maintenance shop under a three-level
maintenance policy where they are repaired by R/R of
the  faulty  SRUs.  If  a  two-level  maintenance  policy  is  con¬ 
sidered,  then  the  LRUs  are  sent  directly  to  the  depot  for 
repair.  The  depot  activities  consist  of  repair  of  the  faulty 
LRUs  or  SRUs,  depending  on  the  maintenance  concept. 

The  maintenance  resources  available  at  each  level 
depend  on  the  type  of  base  at  which  operations  are  being 
modeled.  Two  scenarios  have  been  identified.  These  scenar¬ 
ios  will  be  used  in  the  analysis  of  the  ICNIA  systems  A  and 
B in References 3 and 4, respectively.

Figure 6. Logistics Support Scenario

Conventional  Support  Scenario 

This  scenario  is  representative  of  a  fixed-site 
main  operating  base.  The  following  maintenance  resources 
are  available  for  a  squadron  of  24  aircraft  and  systems: 

1.  Initial  spares  levels  set  at  one  spare 
for  each  LRU. 

2. I-level shop for LRU repair, including
one  Automatic  Test  Equipment  (ATE) 
work  station  available  12  hours  each 
day  and  sufficient  manpower. 

3.  Depot  replenishment  for  SRUs  (three- 
level  maintenance)  or  LRUs  (two-level 
maintenance).

An F-16 sortie schedule and an immediate repair policy are
used as a baseline for this scenario. This 60-day schedule
consists of a seven-day surge or wartime sortie rate, a sustaining
rate for days eight to 30 and a peacetime sortie
rate of 0.7 sortie/aircraft/day for the last 30 days. Immediate
repair is a reasonable baseline assumption for this
scenario,  since  maintenance  resources  are  not  unduly  stressed. 

Advanced  Support  Scenario 

This  scenario  represents  a  dispersed  operating  loca¬ 
tion,  known  as  a  bare  base  or  austere  site,  and  is  consistent 
with  the  Air  Force  2000  report.  The  following  maintenance 
resources are available for a squadron of 24 aircraft and
systems:

1.  Initial  spares  levels  set  at  one  spare 
for  each  LRU. 

2.  An  Industrial  Maintenance  Facility, 
which  possesses  depot  repair  capabil¬ 
ities,  co-located  with  a  Main  Operating 
Base  ("Queen  Bee"  base). 

3.  Depot  SRU/LRU  replenishment  available 
only  after  the  initial  7-day  surge. 


A  maximum  sortie  schedule  is  used  as  a  baseline  for  this 
scenario, putting maximum stress on the maintenance resources.
Under  this  schedule,  each  ICNIA-equipped  aircraft  is  launched 
as  soon  as  it  becomes  available  after  rearm/refuel  or  repair. 
Deferred  repair  has  the  potential  for  sustaining  more  sorties 
in  this  limited-resource  scenario,  and  is  used  as  a  baseline. 


3.3 METHODOLOGY

Perhaps  the  most  operationally  significant  dimension 
of  logistics  support,  and  one  that  is  meaningful  early  in 
the  development  cycle,  is  readiness.  For  fighter  aircraft, 
readiness  can  be  viewed  as  the  ability  to  satisfy  an  immediate 
or short-term requirement for sorties. To evaluate the operational
readiness payoffs of integrated, fault-tolerant CNI
systems,  a  logistics  model  that  captures  these  issues  and  is 
consistent  with  existing  data  during  the  early  stages  of 
system  design  is  needed.  The  Analytic  Sciences  Corporation 
(TASC)  has  developed  the  Simulation  of  Operational  Avail¬ 
ability/Readiness  (SOAR)  model  to  study  readiness  issues  for 
advanced avionics systems (Reference 23).
SOAR has previously been applied to avionics systems such as
the AN/ALQ-131, Airborne Self-Protection Jammer (ASPJ) and
Low-Altitude Navigation and Targeting, Infrared for Night
(LANTIRN). It has now been extended to accommodate deferred
repair policies applicable to integrated, fault-tolerant
avionics.


SOAR  analyzes  the  dynamics  of  aircraft  sorties  and 
maintenance  operations  at  a  single  site  that  are  described 
in  the  logistic  scenarios  of  Section  3.2.  A  system  of  linear 
differential  equations  is  established  for  the  expected  flow 
rates  into  and  out  of  major  system  states.  Aircraft,  systems, 
LRUs  and  SRUs  move  through  ready,  failed,  and  under  repair 
states.  These  equations  are  solved  by  Euler's  single-step 
method,  starting  from  specified  initial  conditions.  Differ¬ 
ent  system  states  and  flow  diagrams  are  used  for  the  cases 
of  immediate  and  deferred  repair. 
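
As a rough illustration of this kind of computation, the sketch below integrates a small three-state linear flow system (ready, queued for repair, under repair) with Euler's single-step method. The state names, rate constants and step size are assumptions chosen for illustration only; they are not the actual SOAR equations or inputs.

```python
def euler_flow(initial, rates, dt, hours):
    """Integrate a 3-state linear flow model with Euler's single-step method.

    initial: (ready, queued, in_repair) expected unit counts at t = 0
    rates:   (fail, induct, repair) fractional flow rates per hour (hypothetical)
    dt:      integration step in hours
    hours:   total simulated time in hours
    """
    ready, queued, in_repair = initial
    fail, induct, repair = rates
    history = [(0.0, ready, queued, in_repair)]
    steps = int(round(hours / dt))
    for n in range(1, steps + 1):
        # Expected flows over one step, evaluated at the start of the step.
        failed = fail * ready * dt
        inducted = induct * queued * dt
        repaired = repair * in_repair * dt
        ready += repaired - failed
        queued += failed - inducted
        in_repair += inducted - repaired
        history.append((n * dt, ready, queued, in_repair))
    return history


if __name__ == "__main__":
    trace = euler_flow(initial=(24.0, 0.0, 0.0),
                       rates=(0.01, 0.5, 0.25), dt=0.5, hours=240)
    t, ready, queued, in_repair = trace[-1]
    print(f"t={t:.0f} h ready={ready:.2f} queued={queued:.2f} "
          f"in repair={in_repair:.2f}")
```

Because every flow leaving one state enters another, the total number of units is conserved at each step, which is a convenient sanity check on any flow diagram of this type.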

Immediate  Repair 

The  basic  SOAR  flow  diagram  for  immediate  repair  is 
shown  in  Figure  7.  Sorties  are  generated  to  meet  the  planned 
sortie  rate  or  until  the  available  aircraft  and  systems  are 
exhausted. The expected number of LRUs returning faulty is
routed to a repair queue, repaired, and finally reissued.
Additional repair states and delays for LRUs and SRUs
that  depend  on  the  level  of  repair  are  not  shown. 


Figure 7. SOAR Avionics Model (Immediate Repair)


Deferred  Repair 

The  SOAR  flow  diagram  for  deferred  repair  is  shown 
in  Figure  8.  Unlike  immediate  repair,  deferral  of  repair 
until  a  critical  failure  occurs  results  in  a  changing  mission 
reliability. For highly fault-tolerant systems, reliability
decreases as a system continues to be flown without repair.
Hence, the age or operating time since repair of each system
must be tracked by the model. Six categories of system age
are  counted  as  separate  states  in  the  model,  with  varying 
Mission  Completion  Success  Probability  (MCSP).  Age  also 
impacts  which  LRUs  are  pulled  from  systems  returning  faulty. 
On  the  average,  more  LRUs  will  be  pulled  from  "old"  systems. 


Figure 8. SOAR Deferred Repair Avionics Model


Once  the  faulty  LRUs  are  pulled,  the  remaining  LRUs 
return  to  "new"  status.  When  they  are  combined  with  other 
Ready For Issue (RFI) stock, a new (age zero) system reenters
the  cycle.  The  remainder  of  the  model  is  equivalent  to  the 
immediate  repair  model. 

Measures  of  Effectiveness 


The  time  sequence  of  any  state  variable  or  rate  in 
the  model  can  be  obtained  as  an  output  from  SOAR.  Two  pri¬ 
mary  measures  of  operational  readiness  have  been  identified 
as  useful  outputs: 

(a)  Mission  Availability:  The  ratio  of 
the  actual  number  of  sorties  generated 
to  the  desired  number. 

(b)  Sortie  Generation  Rate:  The  number  of 
sorties  generated  per  day  per  aircraft. 

The Primary Aircraft Authorization
(PAA) is used as the number of aircraft;
fewer aircraft may be available because
of attrition. This measure is of interest
when a maximum sortie generation
schedule is being used.
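
The two measures defined above reduce to simple ratios. The sketch below states them as functions; the function names and the sortie counts in the usage example are hypothetical and are not SOAR outputs.

```python
def mission_availability(sorties_flown, sorties_desired):
    """Ratio of sorties actually generated to the number desired."""
    if sorties_desired <= 0:
        raise ValueError("sorties_desired must be positive")
    return sorties_flown / sorties_desired


def sortie_generation_rate(sorties_flown, days, paa):
    """Sorties generated per day per aircraft, using PAA as the aircraft count."""
    if days <= 0 or paa <= 0:
        raise ValueError("days and paa must be positive")
    return sorties_flown / (days * paa)


if __name__ == "__main__":
    # Hypothetical case: 24 PAA aircraft, 30 days at a desired 0.7 sortie/aircraft/day.
    desired = 0.7 * 24 * 30
    flown = 480  # assumed value, for illustration only
    print(mission_availability(flown, desired))
    print(sortie_generation_rate(flown, days=30, paa=24))
```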


3.4  MODEL  INPUTS  FOR  AN  EXAMPLE  ARCHITECTURE 

The  inputs  required  by  SOAR  are  listed  in  Tables  7 
and  8.  The  values  listed  in  these  tables  are  for  the  base¬ 
line  case  reported  in  Section  3.5.  Parameters  that  differ 
from  these  values  for  the  conventional  and  advanced  deploy¬ 
ment  scenarios  are  defined  in  Section  3.2.  The  architecture- 
dependent  inputs  are  for  the  example  architecture  of  Sec¬ 
tion  2.4.  A  three-LRU  packaging  is  assumed,  with  one  LRU 
for  each  chain  as  depicted  in  Figure  4. 

The  reliability  inputs  in  Table  8  were  generated  by 
MIREM  using  the  equations  derived  in  Appendix  A.  The  archi¬ 
tecture  of  Section  2.4  and  the  mission  requirements  of  Sce¬ 
nario  2  were  used.  These  inputs  pertain  to  deferred  repair; 
conventional  MTBF  reliability  measures  are  used  as  inputs 
for  immediate  repair.  Each  age  interval  in  Table  8  corre¬ 
sponds  to  100  hours  of  operation  without  repair.  Note  that 
for  new  systems  an  average  of  just  over  one  LRU  contains  a 
failure  when  a  repair  action  occurs,  whereas  for  systems  of 
age  6,  two  LRUs  contain  failures.  In  addition,  the  distribu¬ 
tion  of  faulty  LRUs  shifts  toward  those  with  fault  tolerance 




TABLE 7. SOAR MODEL INPUTS

DESCRIPTION                                            NAME     VALUE

Mission Related
  Desired Sortie Rate (sorties/aircraft/day)
    Surge                                              SX       *
    Sustaining                                         IX       *
    Peacetime                                          PX       0.7
  Interval Between Sorties (hours)                     SINTVL   1
  Attrition Rate (fraction of sorties)
    Surge                                              WARF     0
    Peacetime                                          PARF     0
  Start of Surge Period (hours)                        STWAR    0
  End of Surge Period (hours)                          ENDWAR   168
  Start of Peacetime Period (hours)                    STPEAC   720
  Scenario Length (hours)                              LENGTH   1440
  Mission Length (hours)                               ML       3

Aircraft Related
  Initial Number of Aircraft                           INAC     24
  Aircraft Returning Faulty (fraction)                 DF       0
  Turnaround Time for Faulty Aircraft (hours)          ATAT     9
  Rearm/Refuel Time for Good Aircraft (hours)          FLDEL    2†

System Related
  Initial Number of (Age 1) Systems                    PIRS1    24
  LRU Turnaround Time at the I-Level Shop (hours)      MTAT     4
  LRU False Removal Rate (fraction of LRU failures)    UFP      0.1

Support System Related
  I-Level Support Equipment and Manpower
    Availability (fraction of total time)              SAVAIL   0.5
  Number of I-Level Testers                            NSE      1
  Number of Ready For Issue (RFI) Spares
    LRU 1                                              RFI1     1
    LRU 2                                              RFI2     1
    LRU 3                                              RFI3     1
  Base to Depot Shipping Time (hours)                  BDST     360
  Depot to Base Shipping Time (hours)                  BRST     240

*Value is classified.

†A one hour rearm/refuel time applies to the conventional and
advanced deployment scenarios defined in Section 3.2.


TABLE 8. SOAR RELIABILITY INPUTS (DEFERRED REPAIR)

                                           AGE*
                            1       2       3       4       5       6

PROBABILITY OF CRITICAL
FAILURE DURING MISSION      -       0.0004  0.0007  0.0010  0.0013  0.0016

PROBABILITY THAT LRU
IS FAULTY AT REPAIR
  LRU No. 1                 0.11    0.15    0.19    0.22    0.26    0.29
  LRU No. 2                 0.92    0.94    0.95    0.96    0.97    0.98
  LRU No. 3                 0.26    0.40    0.50    0.57    0.63    0.68

EXPECTED NUMBER OF
FAULTY LRUs                 1.30    1.49    1.64    1.76    1.86    1.95

*Age of a system refers to the number of missions flown or hours of
operation without undergoing repair. Six age ranges are established,
each representing 100 hours or 33 missions.

(LRUs  2  and  3)  as  time  since  repair  increases.  The  mission 
failure  probability  also  increases  with  age.  This  increasing 
"failure  rate"  is  due  to  the  high  fault  tolerance  of  the 
architecture  for  this  mission. 
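
The bottom row of Table 8 appears to be the column sum of the three per-LRU probabilities, as linearity of expectation would suggest. The sketch below checks that reading against the tabulated values; the variable and function names are hypothetical, and the comparison can only be approximate because the table entries are rounded to two decimals.

```python
# Values transcribed from Table 8, one list entry per age category 1..6.
P_FAULTY = {
    1: [0.11, 0.15, 0.19, 0.22, 0.26, 0.29],
    2: [0.92, 0.94, 0.95, 0.96, 0.97, 0.98],
    3: [0.26, 0.40, 0.50, 0.57, 0.63, 0.68],
}
EXPECTED_FAULTY = [1.30, 1.49, 1.64, 1.76, 1.86, 1.95]


def expected_faulty_lrus(p_faulty):
    """Sum the per-LRU fault probabilities for each age category."""
    ages = len(next(iter(p_faulty.values())))
    return [sum(p_faulty[lru][a] for lru in p_faulty) for a in range(ages)]


if __name__ == "__main__":
    for age, (s, e) in enumerate(zip(expected_faulty_lrus(P_FAULTY),
                                     EXPECTED_FAULTY), start=1):
        print(f"age {age}: column sum {s:.2f}, tabulated {e:.2f}")
```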


3.5 RESULTS

Readiness  results  are  presented  in  this  section  for 
the  architecture  introduced  in  Section  2.4  and  the  logistics 
parameters  listed  in  Section  3.4.  A  maximum  sortie  schedule 
and  a  high  aircraft  mission  capable  rate  are  used  for  this 
analysis  to  stress  the  maintenance  resources.  Figure  9  shows 
the  sortie  generation  rate  as  a  function  of  time  for  three- 
level and two-level maintenance concepts. With three-level
maintenance, the spares and intermediate-level shop throughput
are sufficient to maintain maximum readiness. Thus, the
system  under  study  has  no  impact  on  aircraft  availability. 
The  maximum  rate  of  4.8  sorties/aircraft/day  is  determined 
by  the  five  hour  cycle  of  mission  length  plus  rearm/refuel 
time.

Figure 9. Maximum Sortie Generation by Level of Repair

Under two-level maintenance, readiness decreases as faulty
LRUs  are  tied  up  in  the  longer  repair  pipeline  and  spares 
are exhausted. Equilibrium is reached at 2.3 sorties/aircraft/day
when the LRU failures match the LRUs returning
from depot.
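
The maximum rate of 4.8 sorties/aircraft/day quoted above can be reproduced from the Table 7 baseline inputs (ML = 3 hours, FLDEL = 2 hours). The sketch below is a minimal check, assuming round-the-clock operations and no delay other than mission length and rearm/refuel time; the function name is hypothetical.

```python
def max_sortie_rate(mission_hours, rearm_refuel_hours):
    """Upper bound on sorties per aircraft per day for a back-to-back cycle."""
    cycle = mission_hours + rearm_refuel_hours
    if cycle <= 0:
        raise ValueError("cycle time must be positive")
    return 24.0 / cycle


if __name__ == "__main__":
    print(max_sortie_rate(mission_hours=3, rearm_refuel_hours=2))
```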

Sortie generation rate can be increased under the
two-level concept by providing more spares at the organizational
level or by adopting a deferred repair policy. In
Figure 10, immediate and deferred repair policies are compared
under two-level maintenance. The deferred repair
policy can sustain many more sorties than the immediate
repair policy and nearly matches the sorties achieved under
three-level maintenance. Even when the systems age and
repair  actions  start  to  build  up,  the  high  MTBCF  places  less 
demand  on  the  LRU  repair  pipeline  and  a  higher  sortie  rate 
is maintained.

Figure 10. Maximum Sortie Generation by Repair Policy

A  six-LRU  packaging  arrangement  is  compared  with 
the baseline of three LRUs in Figure 11. Immediate repair
is  assumed  so  that  only  the  traditional  reliability  inputs 
are  required  for  the  six  LRUs.  The  six-LRU  configuration 
(increased  modularity)  provides  a  higher  system  availability 
at  the  base  and  thus  a  higher  sortie  rate,  since  a  smaller 
piece  of  the  system  is  tied  up  in  the  repair  pipeline  for 
each  failure. 

The  readiness  benefits  of  three-level  maintenance 
and  increased  modularity  must  be  traded  off  against  the  asso¬ 
ciated  increased  costs.  The  readiness  benefit  of  deferred 
maintenance,  on  the  other  hand,  is  really  only  traded  against 
the  slight  increase  in  mission  failure  probability  (assuming 
that BIT and resource management features are already included
for reasons of fault tolerance).

Figure 11. Maximum Sortie Generation by Modularity


3.6 CONCLUSIONS

A  technique  has  been  presented  for  assessing  the 
readiness impact of integrated, fault-tolerant systems. The
readiness impacts of two- versus three-level maintenance,
modularity and deferred repair have been illustrated. Two
conclusions  can  be  drawn  from  the  supportability  example 
which  was  analyzed: 

1.  Deferral  of  repair  until  a  critical 
failure  occurs  allows  a  high  sortie 
rate  to  be  sustained  for  a  longer  peri¬ 
od  without  repair.  The  payoff  is  sub¬ 
stantial  for  highly  fault-tolerant 
systems,  particularly  under  a  two-level 
maintenance  policy.  However,  some 
penalty  is  paid  in  MCSP  for  flying 
systems  that  contain  failed  components 
(less redundancy).

2. High reliability, deferred repair policies
and increased modularity all provide
impetus to use two-level maintenance,
eliminating expensive intermediate-level
test equipment.

This analysis technique is applicable to ICNIA architectures
during the early stages of design. Specific sortie rate
capabilities for ICNIA will depend on the system's reliability
parameters.


4.  INTERIM  CONCLUSIONS  AND  RECOMMENDATIONS 


A  model  has  been  presented  which  can  represent  the 
features  of  integration  and  fault  tolerance  in  complex  sys¬ 
tems.  Techniques  for  assessing  the  reliability  and  logis¬ 
tics  support  impacts  of  such  an  architecture  were  developed. 
These techniques are applicable to ICNIA architectures during
the  early  stages  of  design.  The  reliability  example  illus¬ 
trates  the  ability  of  the  model  to  assess  redundancy,  re¬ 
configurability  and  component  quality  in  terms  of  mission 
reliability.  The  logistics  support  model  demonstrated  the 
readiness  impact  of  two-  versus  three-level  maintenance, 
deferral  of  repair  actions  until  a  critical  failure  occurs 
and  modularity. 

Several  conclusions  can  be  drawn  from  the  example 
which  was  analyzed.  For  reliability, 

1. Single components that can cause system
failures (critical failures), if they
exist, are the single most important
factor in Mission Completion Success
Probability (MCSP) and a major factor
in Mean Time Between Critical Failure
(MTBCF).

2. A second level of redundancy (at the
LRU level) improves reliability only
if all critical functions are supported
on both of the LRUs.

3.  The  determination  of  which  functions 
are  critical  for  a  mission  and  whether 
they  are  required  simultaneously  can 
drastically  affect  MCSP. 

4.  Reconfigurability  (e.g.,  inter-LRU 
connections)  between  components  that 
are  already  redundant  does  not  neces¬ 
sarily  enhance  reliability. 

In terms of supportability,

1. Deferral of repair until a critical failure
occurs allows a high sortie rate to
be sustained for a longer period without
repair. The payoff is substantial for
highly fault-tolerant systems, particularly
under a two-level maintenance
policy. However, some penalty is paid
in MCSP for flying systems that contain
failed components (less redundancy).


2.  High  reliability,  deferred  repair  poli¬ 
cies  and  increased  modularity  all  provide 
impetus  to  use  two-level  maintenance, 
eliminating expensive intermediate-level
test  equipment. 

The  techniques  developed  have  the  advantage  of  not 
requiring  highly  detailed  design  and  logistics  inputs  and  of 
being  relatively  streamlined.  The  computerized  models  are 
amenable  to  interactive  use  and  could  be  hosted  on  a  mini¬ 
computer.  As  a  result,  the  techniques  could  be  applied  early 
in  the  design  phase  as  a  design  tool  to  aid  the  engineer  in 
building  reliability  and  supportability  into  an  integrated 
system. 


Several  areas  of  additional  research  are  suggested 
by  this  study.  The  reliability  model  developed  here  does 
not  include  the  effects  of  incomplete  or  faulty  BIT  coverage, 
which could cause incorrect switching by the system controller.
For highly fault-tolerant systems, this effect is likely to
be significant. Software reliability and fault tolerance,
which will become increasingly important in these systems,
also need further research. Maintenance concepts that rely
on  smart  systems  to  schedule  and  reduce  the  number  of  repair 
actions  pose  another  major  issue.  The  implications  of  at¬ 
tempting  to  institutionalize  such  a  concept  need  to  be  ex¬ 
plored.  Finally,  the  enhancement  and  possibly  integration 
of  the  models  developed  here  into  an  interactive,  user- 
friendly  package  is  required  if  they  are  to  be  used  by  design 
engineers.

APPENDIX A

MISSION  RELIABILITY  MODEL  (MIREM) 


This  appendix  describes  the  equations  and  algorithms 
used  in  the  Mission  Reliability  Model  (MIREM).  The  model's 
basic  function  is  to  evaluate  the  combinations  of  failures 
which  result  in  failure  of  a  particular  mission  and  compute 
the probability of such failures. Intrinsic hardware reliability
is not predicted by the model, but treated as an input.
The problem to be solved is defined in Section A.1 and the
approach taken is presented in Section A.2. The reliability
computations are developed in Sections A.3 and A.4. Finally,
some additional model outputs are derived in Sections A.5
and A.6.


A.1 THE NETWORK RELIABILITY PROBLEM


We assume that the system consists of n components
or "failure units" with constant failure rate. The traditional
approach is to represent system health by $X = (X_1, \ldots, X_n)$,
where $X_i$ is equal to one if component i is up at the end of the
mission and zero otherwise. For each mission M the system
structure function (Reference 7)

$$
\phi_M(X) = \begin{cases} 1 & \text{if the mission } M \text{ can be supported with system health } X \\ 0 & \text{otherwise} \end{cases}
$$

is determined and $\Pr\{\phi_M(X) = 1\}$ is evaluated enumeratively.

Unfortunately, this approach is practical only if the system
has few components or can be decomposed into modules (Reference 9)
of intermediate size. Furthermore, in order to analyze
various mission requirements it is desirable to express
$\phi$ at the individual function level rather than for a mission.
Missions with various Communication, Navigation and Identification
(CNI) function requirements can then be formulated if
a "combining" operation is defined on the functional structures.
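
To make the cost of the enumerative approach concrete, the sketch below evaluates the probability that a structure function equals one by summing over all 2^n component states. It assumes independent components with exponential lifetimes; the example structure (any two of three components, in series with a fourth) and all names are hypothetical and are not taken from an ICNIA architecture.

```python
from itertools import product
from math import exp


def mission_reliability_by_enumeration(failure_rates, mission_hours, structure):
    """Pr{structure(X) = 1}, summing over all 2**n component up/down states."""
    # Probability that each component is still up at the end of the mission.
    p_up = [exp(-lam * mission_hours) for lam in failure_rates]
    total = 0.0
    for state in product((0, 1), repeat=len(p_up)):
        if structure(state):
            prob = 1.0
            for x, p in zip(state, p_up):
                prob *= p if x else 1.0 - p
            total += prob
    return total


def two_of_three_then_series(x):
    """Hypothetical structure: any 2 of components 0-2, in series with component 3."""
    return int(sum(x[:3]) >= 2 and x[3] == 1)


if __name__ == "__main__":
    rates = [0.001] * 4  # failures per hour, assumed for illustration
    print(mission_reliability_by_enumeration(rates, 3.0, two_of_three_then_series))
```

The loop visits 2^n states, so the run time doubles with every added component, which is why the approach is limited to small or decomposable systems.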


A.2 A SPECIAL STRUCTURE FOR INTEGRATED
RECONFIGURABLE  AVIONICS 

For the reasons discussed above, $\phi_M$ will not be
dealt with explicitly. Instead, the special structure of
$\phi_M$ which has been observed in proposed ICNIA architectures will
be exploited to allow more efficient computations.


We  assume  that  the  system  can  be  described  by  either 
a  one-level  or  two-level  structure.  A  one-level  structure 
consists of a set of k-of-n modules in series. These k-of-n
modules will be referred to as pools. The number of components
(k) required in a pool depends on the function requirements.
Pools which are irrelevant (i.e., k equal to zero)
with  respect  to  certain  functions  are  allowed. 

A two-level structure consists of a set of one-level
structures. Each one-level structure will be referred to as
a  chain.  Chains  are  either  in  "series"  in  the  sense  that 
all  functions  must  use  the  chain,  or  "parallel"  in  the  sense 
that  a  set  of  functions  is  supportable  if  there  exists  an 
allocation  of  functions  to  parallel  chains  such  that  each 
chain  can  support  its  functions.  Parallel  chains  need  not 
be  identical;  in  particular,  some  functions  may  be  restricted 
to  certain  chains. 

Two  slight  generalizations  to  this  model  are  also 
considered.  First,  pools  may  be  described  by  real-valued 
capacities  instead  of  integer-valued  numbers  of  components. 
Any  homogeneous  Markov  chain  can  be  used  to  describe  the 
degradation  of  pool  capacity  as  a  failure  process.  This 
extension  allows  system  resources  that  undergo  partial  fail¬ 
ures  to  be  modeled.  Second,  the  allocation  of  functions  to 
parallel  chains  may  not  be  strict  in  that  pools  may  be  shared 
between  parallel  chains.  For  example,  processing  resources 
in parallel chains may be shared if they communicate through
data  buses. 


A.3 POOL CAPACITY COMPUTATIONS

For the pools in a single chain, let

$C_i$ = capacity of pool i
$u_{ij}$ = utilization by function j of pool i
$c_{\max,i}$ = maximum capacity of pool i (no failures)

We now define two types of pools, according to how functions
combine. If a set CF of critical functions is required simultaneously,
the total requirement for pool i is

$$
r_i = \begin{cases} \sum_{j \in CF} u_{ij} & \text{if pool } i \text{ is contending} \\ \max_{j \in CF} u_{ij} & \text{if pool } i \text{ is noncontending} \end{cases} \qquad \text{(A-1)}
$$


If  the  functions  are  not  required  simultaneously,  all  pools 
are  considered  noncontending.  Pool  capacities  may  represent 
the  number  of  identical  components  in  a  pool  which  are  func¬ 
tioning,  the  number  of  signals  that  can  be  multiplexed  in  a 
single  component,  or  the  available  processing  rate. 
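
The combining rule of Equation A-1 can be sketched in a few lines: utilizations add in a contending pool, while a noncontending pool only needs to cover the largest single demand. The function name, the dictionary representation and the utilization values in the example are assumptions made for illustration.

```python
def pool_requirement(utilization, critical_functions, contending):
    """Total requirement r_i on one pool (Equation A-1).

    utilization:        dict mapping function name -> u_ij for this pool
    critical_functions: iterable of function names in the set CF
    contending:         True if simultaneous functions compete for the pool
    """
    demands = [utilization.get(f, 0) for f in critical_functions]
    if not demands:
        return 0
    return sum(demands) if contending else max(demands)


if __name__ == "__main__":
    u = {"voice": 1, "nav": 2, "iff": 1}  # hypothetical u_ij values
    print(pool_requirement(u, ["voice", "nav"], contending=True))
    print(pool_requirement(u, ["voice", "nav"], contending=False))
```

When the functions are not required simultaneously, the text treats every pool as noncontending, which corresponds to calling the function with `contending=False` for all pools.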

The exponential failure time distribution implies
that $C_i$ is a homogeneous Markov chain with some transition
probability matrix (tpm) $P^i$. If a pool consists of identical
components (each having a capacity of one), then its tpm is

$$
P_{k\ell} = \begin{cases} \binom{k}{\ell} q^{k-\ell} (1-q)^{\ell} & k \ge \ell \\ 0 & \text{otherwise} \end{cases} \qquad \text{(A-2)}
$$

where

$q = 1 - e^{-\lambda t}$
t = mission length
$\lambda$ = component failure rate

Mission Completion Success Probability (MCSP), defined
as the probability that the set of functions CF is
available throughout a mission, is just

$$
MCSP = \prod_i \Pr\{C_i \ge r_i\}
\tag{A-3}
$$

which can be easily computed from the $P^i$ and the initial
system state distributions.
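The single-chain computation of Equations A-2 and A-3 can be sketched in Python as follows. The sketch assumes that q = 1 - exp(-lambda t) is the probability that one component fails during the mission, that every pool starts at full capacity, and that pools fail independently. The function names and the tuple layout of `pools` are illustrative assumptions.

```python
import math


def failure_probability(failure_rate, mission_length):
    """Assumed q: probability that one component fails within the mission."""
    return 1.0 - math.exp(-failure_rate * mission_length)


def binomial_tpm(n_components, q):
    """tpm[k][l] = C(k, l) q^(k-l) (1-q)^l for k >= l, else 0 (Equation A-2)."""
    size = n_components + 1
    tpm = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for l in range(k + 1):
            tpm[k][l] = math.comb(k, l) * q ** (k - l) * (1.0 - q) ** l
    return tpm


def prob_capacity_at_least(tpm, start_capacity, requirement):
    """Pr{C_i >= r_i} at mission end, starting from start_capacity."""
    row = tpm[start_capacity]
    return sum(p for capacity, p in enumerate(row) if capacity >= requirement)


def single_chain_mcsp(pools, mission_length):
    """Product over pools of Pr{C_i >= r_i} (Equation A-3).

    pools: list of (n_components, failure_rate, requirement) tuples.
    """
    mcsp = 1.0
    for n_components, failure_rate, requirement in pools:
        q = failure_probability(failure_rate, mission_length)
        tpm = binomial_tpm(n_components, q)
        mcsp *= prob_capacity_at_least(tpm, n_components, requirement)
    return mcsp


# Two pools: 3 components of which 2 are needed, and 2 of which 1 is needed.
print(single_chain_mcsp([(3, 0.001, 2), (2, 0.002, 1)], mission_length=10.0))
```

Each row of the matrix is a binomial distribution over the number of surviving components, so the rows sum to one.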


A.4  CHAIN  STRUCTURE  COMPUTATIONS

We now consider a two-level structure containing
more than one chain. The computations are illustrated only
for the case of two chains in parallel. Pools are divided
into the following types:


F:  chain-fail  pools  (noncontending) 

S:  shared  pools 

N:  noncontending  pools,  excluding  types  F  and  S 
C:  contending  pools,  excluding  types  F  and  S 


A pair of pools, one in each chain, is type S if their
resources can be used by functions allocated to the opposite
chain. Type F pools are those which, upon failure, prevent
the type S pools in the chain from being utilized. Type F
pools also have the same utilization by all functions; hence,
when they fail, the entire chain fails. The remaining pools
are classified as type N or C according to Equation A-1. If
functions are not required simultaneously, type C pools are
treated as type N.

The state of a chain as determined by its pool
capacities implies the ability to support certain functions. Let

    $X_j^k$ = 1 if function $j$ can be supported on the
              type N pools on chain $k$, 0 otherwise

    $X^k$ = $[X_j^k]$, $j \in CF$

    $UP^k(t)$ = the event that the set of functions CF can
                be supported on the type $t$ pools on chain $k$

    $UP^{1+2}(t)$ = the event that the set of functions CF can
                    be supported on the type $t$ pools on the pair
                    of parallel chains

for $k$ = 1, 2 and $t$ = F, S, N, C. The event $UP^{1+2}(C)$ is
dependent upon $X$ in that an allocation of functions to chains
that is supportable on the type C pools must be consistent
with the supportability of functions on the type N pools.
Similarly, the event $UP^{1+2}(S)$ is dependent on $UP^k(F)$.
Applying these definitions,


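$$
\begin{aligned}
MCSP &= \Pr\{UP^{1+2}(F,S,N,C)\} \\
&= \Pr\{UP^{1+2}(C) \mid UP^{1+2}(N)\} \cdot \Pr\{UP^{1+2}(N)\} \\
&\qquad \cdot \Pr\{UP^{1+2}(S) \mid UP^1(F), UP^2(F)\} \cdot \Pr\{UP^1(F)\}\, \Pr\{UP^2(F)\} \\
&\quad + \Pr\{UP^1(S,N,C)\} \cdot \Pr\{UP^1(F)\} \cdot [1 - \Pr\{UP^2(F)\}] \\
&\quad + \Pr\{UP^2(S,N,C)\} \cdot \Pr\{UP^2(F)\} \cdot [1 - \Pr\{UP^1(F)\}]
\end{aligned}
\tag{A-4}
$$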

The three terms in Equation A-4 correspond to both chains
being up with respect to type F pools, only chain one being
up, and only chain two being up.

To evaluate the first term we condition on $X$:

$$
\Pr\{UP^{1+2}(C) \mid UP^{1+2}(N)\} \cdot \Pr\{UP^{1+2}(N)\}
= \sum_{x^1 + x^2 \ge \mathbf{1}}
\Pr\{UP^{1+2}(C) \mid X^1 = x^1, X^2 = x^2\}\,
\Pr\{X^1 = x^1\}\, \Pr\{X^2 = x^2\}
\tag{A-5}
$$


The distribution of $X^k$ is determined by applying the
single-chain analysis of Section A.3 to the type N pools for all
subsets of the functions CF, giving $\Pr\{X^k \ge x\}$ for all $x$.
The law of total probability is then used to obtain
$\Pr\{X^k = x\}$.
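One way to carry out this last step is inclusion-exclusion over the functions that are not required to be supported. The report names only the law of total probability, so the specific procedure below is an assumption. In this Python sketch, `prob_at_least` is a hypothetical callable that returns the probability that every function in a given subset is supportable on the type N pools of one chain.

```python
from itertools import combinations


def exact_state_probability(functions, supported, prob_at_least):
    """Pr{X = x}, where x_j = 1 exactly for the functions in `supported`.

    prob_at_least(subset) must return Pr{X_j = 1 for all j in subset}.
    """
    supported = frozenset(supported)
    others = [j for j in functions if j not in supported]
    total = 0.0
    for n_extra in range(len(others) + 1):
        sign = (-1) ** n_extra
        for extra in combinations(others, n_extra):
            total += sign * prob_at_least(supported | frozenset(extra))
    return total


# Check against a case with a known answer: independent functions.
p = {"a": 0.9, "b": 0.8, "c": 0.5}


def independent_prob_at_least(subset):
    result = 1.0
    for j in subset:
        result *= p[j]
    return result


# Only function "a" supportable: 0.9 * (1 - 0.8) * (1 - 0.5)
print(exact_state_probability(p.keys(), {"a"}, independent_prob_at_least))
```

The enumeration grows exponentially with the number of functions, which is acceptable only for small sets CF.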

The type C pools are treated as follows. We assume
that type C pools occur in pairs, one on each chain, and use
the index $i$ to refer to pairs rather than individual pools.
A superscript will be used to indicate chain (e.g., $C_i^k$,
$c_{\max,i}^k$). The utilizations $u_{ij}$, however, are assumed
to be the same for both chains. The allocation of functions
to chains is represented by

$$
y_j =
\begin{cases}
1 & \text{if function } j \text{ uses chain 1} \\
0 & \text{if function } j \text{ uses chain 2}
\end{cases}
\qquad
y = [y_j],\ j \in CF
$$

Let

$$
r_i(y) = \sum_{j \in CF} y_j\, u_{ij}
\tag{A-6}
$$

The conditional event in Equation A-5 occurs if there is
some allocation $y$ such that

$$ \mathbf{1} - x^2 \le y \le x^1 \tag{A-7a} $$

$$ r_i(y) \le C_i^1 \tag{A-7b} $$

$$ r_i(\mathbf{1} - y) \le C_i^2 \tag{A-7c} $$

for all type C pools $i$. That is, functions can be assigned
only to chains on which the type N pools can support them,
and the total function requirements must not exceed the type
C pool capacities.

A necessary condition for such an allocation to
exist is

$$ r_i(\mathbf{1} - x^2) \le C_i^1 \tag{A-8a} $$

$$ r_i(\mathbf{1} - x^1) \le C_i^2 \tag{A-8b} $$

$$ r_i(\mathbf{1}) \le C_i^1 + C_i^2 \tag{A-8c} $$

$$ \max_{j \in CF} u_{ij} \le \max\{C_i^1, C_i^2\} \tag{A-8d} $$

for all type C pools $i$. The probability of condition A-7
will be approximated by the probability of condition A-8.
To motivate this approximation, note that condition A-8c
requires that sufficient resources be available to perform
the required functions. Errors occur in this approximation
only in the probability that the required resources will be
divided in unusable proportions on the two chains. In the
case where there is only one type C pool pair $i$, $u_{ij} = 1$ and
$C_i^k$ takes on integer values, A-7 is equivalent to A-8a-c.
Condition A-8d addresses the case of some $u_{ij}$ being very
large. Hence the approximation is reasonable.
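The relationship between conditions A-7 and A-8 can be explored with a small brute-force check. The Python sketch below reads A-7 as the existence of an allocation vector y, and A-8 as the four necessary inequalities, for given realized capacities. All function names and the data layout (`u[i][j]` for utilizations, `c1[i]` and `c2[i]` for the capacities of pool pair i) are assumptions made for this example.

```python
from itertools import product


def requirement(y, u_i):
    """r_i(y): sum of y_j * u_ij over the functions (Equation A-6)."""
    return sum(y_j * u_ij for y_j, u_ij in zip(y, u_i))


def allocation_exists(x1, x2, u, c1, c2):
    """Condition A-7: search every allocation y of functions to chains."""
    n = len(x1)
    for y in product((0, 1), repeat=n):
        # A-7a: a function may only use a chain whose type N pools support it.
        if any(y[j] > x1[j] or (1 - y[j]) > x2[j] for j in range(n)):
            continue
        other = [1 - y_j for y_j in y]
        if all(requirement(y, u[i]) <= c1[i] and
               requirement(other, u[i]) <= c2[i]
               for i in range(len(u))):
            return True
    return False


def necessary_condition(x1, x2, u, c1, c2):
    """Condition A-8, checked for every type C pool pair."""
    only_chain_1 = [1 - v for v in x2]   # functions chain 2 cannot support
    only_chain_2 = [1 - v for v in x1]   # functions chain 1 cannot support
    everything = [1] * len(x1)
    for i, u_i in enumerate(u):
        if requirement(only_chain_1, u_i) > c1[i]:          # A-8a
            return False
        if requirement(only_chain_2, u_i) > c2[i]:          # A-8b
            return False
        if requirement(everything, u_i) > c1[i] + c2[i]:    # A-8c
            return False
        if max(u_i) > max(c1[i], c2[i]):                    # A-8d
            return False
    return True


x = [1, 1, 1]   # every function is supportable on either chain
# Three functions needing 2 units each, 3 units available on each chain:
# the total capacity is sufficient but is split in unusable proportions.
print(necessary_condition(x, x, [[2, 2, 2]], [3], [3]))
print(allocation_exists(x, x, [[2, 2, 2]], [3], [3]))
```

In this example A-8 holds while no allocation satisfies A-7, which is the kind of case in which the approximation overstates supportability.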
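Using Equation A-8 and assuming $C_i^k$ is integer-valued,

$$
\Pr\{UP^{1+2}(C) \mid X^1 = x^1, X^2 = x^2\}
= \prod_{i \in C} \sum_{c^1 = k}^{c^1_{\max,i}}
\Pr\{C_i^1 = c^1\} \cdot \Pr\{C_i^2 \ge \ell\}
\tag{A-9}
$$

where

$$
k = \max\{ r_i(\mathbf{1} - x^2),\ r_i(\mathbf{1}) - c^2_{\max,i} \}
$$

$$
\ell =
\begin{cases}
\max\{ r_i(\mathbf{1} - x^1),\ r_i(\mathbf{1}) - c^1 \}
  & \text{if } c^1 \ge \max_{j \in CF} u_{ij} \\
\max\{ r_i(\mathbf{1} - x^1),\ r_i(\mathbf{1}) - c^1,\ \max_{j \in CF} u_{ij} \}
  & \text{otherwise}
\end{cases}
$$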


The type S pools are treated as follows. We assume
that type S pools occur in pairs, one on each chain, and use
the same notation as for type C pools. Because the paired
resources are shared, we need consider only the combined
capacity of the two pools:

$$
\begin{aligned}
\Pr\{UP^{1+2}(S) \mid UP^1(F), UP^2(F)\}
&= \prod_{i \in S} \Pr\{C_i^1 + C_i^2 \ge r_i(\mathbf{1})\} \\
&= \prod_{i \in S} \sum_{c^1 = r_i(\mathbf{1}) - c^2_{\max,i}}^{c^1_{\max,i}}
\Pr\{C_i^1 = c^1\}\, \Pr\{C_i^2 \ge r_i(\mathbf{1}) - c^1\}
\end{aligned}
\tag{A-10}
$$
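For one pair of shared pools, the inner sum of Equation A-10 is a convolution of the two capacity distributions. The following Python sketch assumes independent, integer-valued capacities whose distributions are given as lists indexed by capacity value. The function name and this representation are assumptions made for illustration.

```python
def prob_shared_pair_supports(pmf1, pmf2, requirement):
    """Pr{C1 + C2 >= requirement} for one pair of type S pools.

    pmf1[c] = Pr{C1 = c} and pmf2[c] = Pr{C2 = c}; the list lengths
    minus one are the maximum capacities of the two pools.
    """
    def tail(pmf, level):
        # Pr{C >= level}; equals 1 when level <= 0.
        return sum(p for c, p in enumerate(pmf) if c >= level)

    c2_max = len(pmf2) - 1
    # Below this value of c1, even a full second pool cannot cover the rest.
    start = max(0, requirement - c2_max)
    return sum(pmf1[c1] * tail(pmf2, requirement - c1)
               for c1 in range(start, len(pmf1)))


# Two single-component pools, up with probability 0.9 and 0.8.
pmf_chain_1 = [0.1, 0.9]
pmf_chain_2 = [0.2, 0.8]
print(prob_shared_pair_supports(pmf_chain_1, pmf_chain_2, 1))
print(prob_shared_pair_supports(pmf_chain_1, pmf_chain_2, 2))
```

With a combined requirement of one unit the pair fails only if both pools are down. With a requirement of two units both pools must be up.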


Applying the single-chain analysis to the type F
pools gives $\Pr\{UP^k(F)\}$. Combined with Equations A-5, A-9
and A-10, this completes the evaluation of the first term of
Equation A-4.

To evaluate the second and third terms, only
$\Pr\{UP^k(S,N,C)\}$ is needed. It is obtained by applying the
single-chain analysis to the type S, N and C pools for the
set of functions CF. Note that if not all functions in CF
are supported on chain $k$ then $\Pr\{UP^k(S,N,C)\} = 0$.


Equation A-4 gives MCSP for a pair of parallel chains.
If the system contains several chains or parallel chain sets
in series, with reliabilities $MCSP_i$, the combined reliability
is

$$
MCSP = \prod_{\text{chains } i} MCSP_i
\tag{A-11}
$$


A.5  MEAN  TIME  BETWEEN  CRITICAL  FAILURE  ALGORITHM

Another measure which can be computed by MIREM is
Mean Time Between Critical Failure (MTBCF), defined as the
expected operating time without repair until a critical
function is lost, starting with full system capacity. Let MCSP(t)
be the weapon system reliability for an operating time of $t$
hours. Then

$$
MTBCF = -\int_0^\infty t \, dMCSP(t) = \int_0^\infty MCSP(t)\, dt
\tag{A-12}
$$


This integral is evaluated in MIREM using the trapezoidal
rule with a variable step size which can be modified by
exploration. Letting $F(t) = MCSP(t)$, the algorithm proceeds
as follows.


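1.  Select $\alpha$, $\delta$, $\epsilon_{\min}$, $\epsilon_{rel}$, and $t_1$.
    Initialize $t_0 = 0$, $dt = t_1 - t_0$, $k = 1$,

        $\lambda_1 = \ln[F(t_0)/F(t_1)]/dt$
        $\Delta = [F(t_0) + F(t_1)]\,dt/2$
        $MTBCF = \Delta$

2.  $dt(\delta) = 0.8\,\delta\,dt/(\Delta\,\lambda_k)$

3.  If $k < 2$, $dt = \min\{\alpha\,dt,\ dt(\delta)\}$.

    If $k \ge 2$,

        $dt(\epsilon_{rel}) = \left[ 8 F(t_{k-1})\,(0.8\,\epsilon_{rel}) \Big/ \left| \frac{d^2F}{dt^2} \right| \right]^{1/2}$

    and $dt = \min\{\alpha\,dt,\ \max\{dt(\delta),\ dt(\epsilon_{rel})\}\}$.

4.  $k = k + 1$

5.  With $t_k = t_{k-1} + dt$, compute

        $\frac{d^2F}{dt^2} = 2 \left[ \frac{F(t_k) - F(t_{k-1})}{t_k - t_{k-1}} - \frac{F(t_{k-1}) - F(t_{k-2})}{t_{k-1} - t_{k-2}} \right] \Big/ (t_k - t_{k-2})$

        $\lambda_k = \ln[F(t_{k-1})/F(t_k)]/dt$
        $\Delta = [F(t_{k-1}) + F(t_k)]\,dt/2$

6.  If $\epsilon_{rel} < (dt)^2 \left| \frac{d^2F}{dt^2} \right| \Big/ |8 F(t_{k-1})|$
    and $\delta < \Delta\,\lambda_k$ then set $dt = dt/2$ and go to step 5.

7.  $MTBCF = MTBCF + \Delta$

8.  If $F(t_k)/\lambda_k > 0.1$ and
    $\epsilon_{\min} < \frac{|\lambda_k - \lambda_{k-1}|}{\lambda_k (t_k - t_{k-1})}$
    then go to step 2. Otherwise, set

        $MTBCF = MTBCF + F(t_k)/\lambda_k$

    and stop.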
This algorithm is based on the assumption that, at
least for large $t$, $F(t)$ can be approximated by $a e^{-\lambda t}$. Local
estimates of $\lambda$ serve as a basis for selecting a step size,
$dt(\delta)$, which will include the desired fraction $\delta$ of the
entire integral. Estimates of $\lambda$ are also used as a stopping
criterion. If the relative change in $\lambda$ is less than $\epsilon_{\min}$
per unit change in $t$, it is assumed that the remainder of
$F$ has a constant failure rate and it is integrated analytically.
The parameter $\epsilon_{rel}$ provides an alternative basis for
increasing the step size, based on the average relative error in $F$
calculated from its second derivative. The scaling parameter
$\alpha$ sets a limit on how rapidly step size can increase.

The parameter values that were used in this study
are

    $\alpha$            =  4
    $\delta$            =  0.025
    $\epsilon_{\min}$   =  0.00005 per hr
    $\epsilon_{rel}$    =  0.005
    $t_1$               =  3 hrs


Tests indicate that the MTBCF accuracy obtained using these
values is better than 0.5%, while an average of only 22
function evaluations was required. These results suggest that
the algorithm is more efficient for the life distributions
considered than a general purpose routine.
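The main ideas of the algorithm can be illustrated with a reduced Python sketch. It is not the MIREM routine. It keeps the trapezoidal accumulation, the local failure-rate estimate, the growth limit $\alpha$, the $\delta$-based step size and the analytic exponential tail, but it omits the $\epsilon_{rel}$ step control, the step halving of step 6 and the 0.1 test of step 8. The function name, the `max_steps` guard and the requirement that F be positive and strictly decreasing are assumptions of this sketch.

```python
import math


def mtbcf_sketch(F, t1=3.0, alpha=4.0, delta=0.025, eps_min=5e-5,
                 max_steps=10000):
    """Approximate the integral of F(t) from 0 to infinity.

    F must be positive and strictly decreasing, so that every local
    failure-rate estimate is positive.
    """
    t_prev, t = 0.0, t1
    f_prev, f = F(t_prev), F(t)
    dt = t - t_prev
    lam = math.log(f_prev / f) / dt      # local failure-rate estimate
    area = (f_prev + f) * dt / 2.0       # first trapezoid
    total = area
    for _ in range(max_steps):
        # Step expected to capture about 0.8 * delta of the integral,
        # limited to alpha times the previous step.
        dt = min(alpha * dt, 0.8 * delta * dt / (area * lam))
        t_prev, f_prev, lam_prev = t, f, lam
        t = t_prev + dt
        f = F(t)
        lam = math.log(f_prev / f) / dt
        area = (f_prev + f) * dt / 2.0
        total += area
        # Stop once the rate estimate is stable per unit time.
        if abs(lam - lam_prev) / (lam * dt) <= eps_min:
            break
    return total + f / lam               # exponential tail F(t_k) / lambda_k


# Constant failure rate of 0.01 per hour: the exact MTBCF is 100 hours.
print(mtbcf_sketch(lambda t: math.exp(-0.01 * t)))
# Two redundant components, each 0.01 per hour: the exact MTBCF is 150 hours.
print(mtbcf_sketch(lambda t: 2 * math.exp(-0.01 * t) - math.exp(-0.02 * t)))
```

The defaults are the parameter values quoted for this study. Because the sketch leaves out part of the step control, its accuracy and number of function evaluations should not be taken as representative of MIREM.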


A.6  SIMULATION  OF  OPERATIONAL  AVAILABILITY/READINESS
(SOAR)  RELIABILITY  INPUTS

The  MCSP  capability  of  MIREM  also  serves  as  a  basis 
for  computing  the  reliability  inputs  used  in  the  Simulation 
of  Operational  Availability/Readiness  (SOAR)  model  to  evaluate 
deferred  maintenance  policies.  In  particular,  we  consider  a 
maintenance  policy  of  repairing  only  after  missions  in  which 
critical  failures  occur  and  of  replacing  all  LRUs  which  contain 
failures.  The  SOAR  inputs  are: 

    $MCSP(t;\tau)$ = the probability of completing a mission of
                     length $t$ with no critical failures for a
                     system that has operated $\tau$ hours since
                     repair with no critical failures

    $R_C(\tau, \tau+t)$ = the probability that an LRU consisting of the
                          set of components C contains a failure $\tau + t$
                          hours after repair given that a critical
                          failure occurred between $\tau$ and $\tau + t$.

Both  the  weapon  system  reliability  and  the  probability  of 
pulling  an  LRU  depend  on  the  time  since  repair  because  of 
the  build-up  of  noncritical  failures. 

Following Reference 8, let

    $T_i$ = operating time since repair at which component $i$
            fails

    $T_C$ = time of first failure in the set of components C

    $T_g$ = time at which a critical failure occurs in the
            system

    $F(\cdot)$ = vector distribution function of $\{T_i\}$

    $\bar{F}(\cdot)$ = $1 - F(\cdot)$

    $h(\cdot)$ = system (i.e., critical failure) reliability
                 function

    $(1_C, x)$ = the vector $x$ with all components in the index
                 set C replaced by 1.

These definitions allow us to represent conditional
failure probabilities (see also Reference 24):

$$ h(\bar{F}(t)) = \Pr\{T_g > t\} \tag{A-13a} $$

$$ h(1_C, \bar{F}(t)) = \Pr\{T_g > t \mid T_C > t\} \tag{A-13b} $$

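In these terms

$$
\begin{aligned}
MCSP(t;\tau) &= \Pr\{T_g > \tau + t \mid T_g > \tau\} \\
&= \Pr\{T_g > \tau + t\} / \Pr\{T_g > \tau\} \\
&= MCSP(\tau + t) / MCSP(\tau)
\end{aligned}
\tag{A-14}
$$

and

$$
\begin{aligned}
R_C(\tau, \tau+t) &= \Pr\{T_C \le \tau + t \mid \tau < T_g \le \tau + t\} \\
&= 1 - \frac{\Pr\{\tau < T_g \le \tau + t \mid T_C > \tau + t\}\, \Pr\{T_C > \tau + t\}}
            {\Pr\{\tau < T_g \le \tau + t\}} \\
&= 1 - \frac{[\,h(1_C, \bar{F}(\tau)) - h(1_C, \bar{F}(\tau+t))\,]\, \bar{F}_C(\tau+t)}
            {h(\bar{F}(\tau)) - h(\bar{F}(\tau+t))}
\end{aligned}
\tag{A-15}
$$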


To evaluate Equation A-15 using MIREM, we observe that

$$
\bar{F}_C(t) = \Pr\{T_C > t\} = \prod_{i \in C} \Pr\{T_i > t\}
\tag{A-16a}
$$

$$
h(\bar{F}(t)) = MCSP(t)
\tag{A-16b}
$$

and that $h(1_C, \bar{F}(t))$ can be evaluated in the same fashion as
$h(\bar{F}(t))$ if we first set $\lambda_i = 0$ for $i \in C$.
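The two SOAR inputs can be illustrated with a small Python sketch. It assumes independent components with exponential failure times and takes the system reliability function h as a caller-supplied function of the vector of component survival probabilities. In MIREM that role is played by the MCSP computation, so the explicit `h`, the function names and the index-set representation of the LRU are assumptions made for this example.

```python
import math


def survival(rates, t):
    """Component survival probabilities exp(-lambda_i * t)."""
    return [math.exp(-lam * t) for lam in rates]


def mission_success_given_age(h, rates, t, tau):
    """MCSP(t; tau) = MCSP(tau + t) / MCSP(tau)  (Equation A-14)."""
    return h(survival(rates, tau + t)) / h(survival(rates, tau))


def lru_failure_probability(h, rates, lru, t, tau):
    """R_C(tau, tau + t) of Equation A-15 for the component index set `lru`."""
    # h(1_C, .) is evaluated like h(.) after setting lambda_i = 0 for i in C.
    frozen = [0.0 if i in lru else lam for i, lam in enumerate(rates)]
    # Probability that no component of the LRU has failed by tau + t.
    lru_survives = math.prod(math.exp(-rates[i] * (tau + t)) for i in lru)
    numerator = (h(survival(frozen, tau))
                 - h(survival(frozen, tau + t))) * lru_survives
    denominator = h(survival(rates, tau)) - h(survival(rates, tau + t))
    return 1.0 - numerator / denominator


def series(p):
    # Critical failure when either component fails.
    return p[0] * p[1]


def parallel(p):
    # Critical failure only when both components have failed.
    return 1.0 - (1.0 - p[0]) * (1.0 - p[1])


rates = [0.01, 0.01]
print(mission_success_given_age(series, rates, t=2.0, tau=5.0))
print(lru_failure_probability(series, rates, lru={0}, t=2.0, tau=0.0))
print(lru_failure_probability(parallel, rates, lru={0}, t=2.0, tau=0.0))
```

For the parallel pair the LRU is certain to contain a failure after a critical failure, because a critical failure requires both components to have failed. For the series pair the probability is close to one half for short missions.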
APPENDIX B

GLOSSARY

A/J         Anti-jam
BIT         Built-In Test
CNI         Communication, navigation and identification
GPS         Global positioning system
HF          1. High frequency, 2. HF clear voice communication
            set, AN/ARC-190
ICNIA       Integrated communication, navigation and
            identification avionics
IFFI        Identify friend-or-foe, interrogator set, AN/APX-76B
IFFT        Identify friend-or-foe, transponder set, AN/APX-101
ILS         Instrument landing system, AN/ARC-108
JTIDS       Joint tactical information distribution system
LCC         Life-cycle cost
LRU         Line replaceable unit
MCSP        Mission completion success probability
MIREM       Mission reliability model
MTBCF       Mean time between critical failure
MTBF        Mean time between failure
MTTR        Mean time to repair
RFI         Ready for issue spare part
R/R         Remove and replace maintenance action
SDU         Secure data unit
SEEK TALK   UHF anti-jam voice communication set (to be
            replaced by HAVE CLEAR)
SINCGARS    Single channel ground and airborne radio
            subsystem
SOAR        Simulation of operational availability/readiness
SRU         Shop replaceable unit
TACAN       Tactical air navigation set, AN/ARN-118
UHF         1. Ultra-high frequency, 2. UHF clear voice
            communication set, AN/ARC-164
VHF         1. Very high frequency, 2. VHF clear voice
            communication set, AN/ARC-186